An efficient of Neural Address Predictor applies to Address Vector Quantisation codebook in speech processing

Authors

  • J. Srinonchat
  • S. Danaher
Abstract

A speech waveform is a continuous signal made up of voiced and unvoiced segments. Historically, the waveform has been coded by dividing it into frames, typically 30 ms long, and coding each frame separately. Speech, however, is created by a physical system and is substantially shaped by the vocal tract. Because it is physically impossible for the vocal tract to move instantaneously from one state to an arbitrary other state, trends should exist between successive vocal tract positions. In the coding techniques used in this paper, the vocal tract positions manifest themselves as vector-quantised LSP coefficients. Although speech coding is a field in its own right, strong links exist between image compression and speech compression. In this work, the Address-VQ technique, which is used in image compression, is applied to the compression of coded speech parameters. In addition, a lossy technique called Neural Address Prediction is applied to reduce the bit rate further. The work exploits the repetitiveness of a single speaker's attributes to achieve this reduction. Preliminary results indicate that more than about 33% additional compression is achievable using Neural Address Prediction with an Address Vector Quantisation codebook. Because Neural Address Prediction is a lossy compression scheme, the prediction error directly affects the quality of the synthesised speech, especially in voiced frames.
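
The framing and codebook-address ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy two-dimensional codebook, and the 8 kHz sample rate are all assumptions made for illustration, and a trivial "repeat the previous address" rule stands in for the neural predictor, whose design the abstract does not describe. The sketch only shows how a waveform might be cut into 30 ms frames, how a parameter vector maps to a codebook address, and how often successive addresses repeat.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], sample_rate_hz: int,
                      frame_ms: float = 30.0) -> List[List[float]]:
    """Cut a waveform into non-overlapping frames; a trailing partial frame is dropped."""
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    return [list(samples[i * frame_len:(i + 1) * frame_len])
            for i in range(n_frames)]


def nearest_address(vector: Sequence[float],
                    codebook: Sequence[Sequence[float]]) -> int:
    """Return the address (index) of the codebook entry closest to `vector`
    in squared Euclidean distance."""
    best_addr, best_dist = 0, float("inf")
    for addr, code in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(vector, code))
        if dist < best_dist:
            best_addr, best_dist = addr, dist
    return best_addr


def previous_address_hit_rate(addresses: Sequence[int]) -> float:
    """Fraction of frames whose address equals the previous frame's address.
    Hypothetical stand-in for a learned predictor, not the paper's neural model."""
    if len(addresses) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(addresses, addresses[1:]) if prev == cur)
    return hits / (len(addresses) - 1)


# Toy usage with assumed values: 8 kHz sampling gives 240 samples per 30 ms frame.
frames = split_into_frames([0.0] * 480, sample_rate_hz=8000)
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
address = nearest_address([0.9, 1.2], toy_codebook)
hit_rate = previous_address_hit_rate([1, 1, 2, 2, 2, 0])
```

A high hit rate for even this naive rule would suggest the frame-to-frame trends the abstract relies on; a learned predictor would aim to capture more of that structure.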


Similar resources

Efficient product code vector quantisation using the switched split vector quantiser

In this article, we first review the vector quantiser and discuss its well-known advantages over the scalar quantiser, namely the space-filling advantage, the shape advantage, and the memory advantage. It is important to understand why vector quantisers always perform better than any other quantisation scheme for a given dimension, as this will provide the basis for our investigation on improvi...


Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies

In this paper, we examine a coding scheme for quantising feature vectors in a distributed speech recognition environment that is more robust to noise. It consists of a vector quantiser that operates on the logarithmic filterbank energies (LFBEs). Through the use of a perceptually-weighted Euclidean distance measure, which emphasises the LFBEs that represent the spectral peaks, the vector quanti...


An Improved Vector Quantisation Algorithm for Speech Transmission Over Noisy Channels

Vector quantisation (VQ) is a method widely used in low bit-rate coding and transmission of speech signals. Unfortunately, a single bit error in the transmitted index, due to noise in the transmission channel, could degrade perceived speech quality at the receiver quite dramatically, as the reference vector retrieved by the corrupted index may differ greatly from the vector corresponding to the ...


A Fast Index Assignment Algorithm for Vector Quantization over Noisy Transmission Channels

Vector quantisation, a widely used technique in low-bit rate coding of speech signals, is highly sensitive to errors in the transmitted codeword caused by noise in the transmission channel. This paper describes an efficient index assignment algorithm, based on Hall's solution to the quadratic assignment problem, used to re-order the codebook such that the effect of transmission errors is minimised...


Use of multiple vector quantisation for semicontinuous-HMM speech recognition - Vision, Image and Signal Processing, IEE Proceedings-

Although the continuous hidden Markov model (CHMM) technique seems to be the most flexible and complete tool for speech modelling, it is not always used for the implementation of speech recognition systems because of several problems related to training and computational complexity. Thus, other simpler types of HMMs, such as discrete (DHMM) or semicontinuous (SCHMM) models, are commonly utilise...




Publication date: 2004